MACD 5

Text File | 1992-04-27 | 24.5 KB | 656 lines

INTRODUCTION
------------

If you ever read a book about assembler, you were probably chased by weird
things like the carry flag, the status register, the program counter, etc.
Maybe you've read about the 'Arithmetic & Logical Unit', about the 'CPU' and
the 'STACK'. If so, just put it all aside for a while. Most of these things
are very important for understanding how a computer works, but they have
very little to do with coding itself. Most of them will become clear while
we're working...

If you have never coded anything in your life, you should seriously consider
something easier to start with, like BASIC. It is rather important to know a
bit about programming techniques, like loops, structured code and
subroutines, before you start with assembler. Assembler is a second
generation language, which means that there are not many tools to make life
easy for the programmer. For example, there's no 'print' instruction in
assembler. This doesn't make things easier for a total beginner...


A FIRST PROGRAM
---------------

Here follows a first VERY easy program. We'll use it to build up our
knowledge of assembler. Of course, before you start making demos, you have
to know assembler. Don't worry if you don't get it yet, this will change
very rapidly. Here we go:

top:    move.b  #5,d0
line1:  tst.b   d0
        beq     end
        sub.b   #1,d0
        bra     line1
end:    rts


GENERAL THINGS - VERY IMPORTANT
-------------- - - - - - - - -

** 1

In assembler, we mostly use hexadecimal or binary numbers rather than the
normal decimals. This is kinda confusing in the beginning, coz we're used to
normal numbers, but in fact it's easy. Let's try with the number 423, and
think of the meaning of the position of each digit:

    the '4' in 423 means in fact 4x100 (= 4 x 10^2),
    the '2' means 2x10 (= 2 x 10^1),
    the '3' means 3x1  (= 3 x 10^0).

As you see, each position represents another power of 10. Each digit can
have a value from 0 to 9. In binary notation, we have powers of 2, and each
digit can only be 0 or 1.
100101 means in fact 1x2^5 + 0x2^4 + 0x2^3 + 1x2^2 + 0x2^1 + 1x2^0
(= 32 + 4 + 1 = 37 in decimal).

The same thing can be said about 'hexadecimal' numbers: they use powers of
16, and each digit can have a value from 0 to 15 (0,1,...9,a,b,c,d,e,f).
You can count for yourself:

    decimal 10 = hexadecimal a
    decimal 16 = hexadecimal 10   (1 x 16^1)
    decimal 2  = binary 10        (1 x 2^1)
    decimal 4  = binary 100       (1 x 2^2)
    decimal 5  = binary 101       (1 x 2^2 + 1 x 2^0)

This is of course nice to know, but you can hardly sit down and work out
each value with powers-of-i-dunno-what, coz it might take HOURS !!!!
Therefore we have - luckily - the ASSEMBLER !! If you use asmone (you
probably do), you can easily convert any number from binary to hexadecimal
to decimal, or back. You just tell asmone what kind of number you are giving
it, and it will do the rest for you. For each kind there is a special
'MARKER':

    BINARY NUMBERS have a '%' preceding:        %1001101011
    HEXADECIMAL NUMBERS use a '$' before them:  $38a83b
    DECIMAL NUMBERS have nothing:               12353

$10 is something completely different than %10 or 10 !!!
($10 is sixteen, %10 is two, and 10 is just ten.)

Later, when you make a program, you can enter any number in the notation you
want: just put the correct prefix. You'll soon discover that in some cases
hexadecimal notation is preferred, while in other cases binary notation is
more interesting... It depends.

** 2

Now about BYTES, WORDS and LONGWORDS.

A computer's memory is built up of enormous amounts of bits. These are in
fact switches that can have 2 states: on or off, 1 or 0. To make it all a
bit easier to handle, these bits are grouped: if you group 8 bits, you have
a BYTE. 16 bits (2 bytes) are called a WORD, and 32 bits (2 words, 4 bytes)
are called a LONGWORD. You can do things with these groups of bits, like
putting a certain value into them. It's obvious you can store larger numbers
in a LONGWORD than in a word or in a byte.

Grouping all these bits into bytes, you get the memory, where each byte has
its own number-in-the-row. This number is the 'address' of this byte.
Address 1493 is simply byte number 1493 in memory (counting starts at 0).
(Addresses are mostly expressed as HEXADECIMAL numbers, like $fc0000)
Each address contains a value, made up by those bits. For example, address
number $10248 could contain the value 100 (just a silly example).
In hexadecimal notation, the value of one byte in memory can be between #$00
and #$ff (decimal: between #0 and #255).

If you wish to take a word from memory, you just take 2 successive bytes
(for example $10000 & $10001). We say in this case that the word is at
$10000 (and not $10001).

WORDS CAN ONLY START ON AN EVEN ADDRESS
(so a word can not be, for example, on $10001 & $10002)

If address $10000 contains the byte #$34 and address $10001 contains the
byte #$a8, the WORD at $10000 contains #$34a8.

Longwords have the same story: only at EVEN ADDRESSES, but here you take 4
bytes in a row. Bytes #$10, #$3a, #$29, #$00 make the LONGWORD #$103a2900.
You see, to switch between bytes, words and longwords, the hexadecimal
notation is much easier than the decimal.

Whether data in memory is used as a BYTE, a WORD or a LONGWORD is not
predefined. It just depends on how you wish to use it.

** 3

Now about the DATA- AND ADDRESS-REGISTERS.

These are zones in the processor where you can temporarily store data. In
the Amiga's processor there are 8 data and 8 addressregisters: d0, d1,... d7
and a0, a1,... a7. They're used VERY OFTEN. The special thing about them,
and this is in fact an aspect of the hardware, is that they are part of the
PROCESSOR itself, where the memory is not. Therefore the registers can be
accessed much faster than memory.

REGISTERS are also used to store the data that will be used for a
mathematical calculation, and once this calculation is done, the results
will be in the registers again.
You can't, for example, subtract the value in address x from the value in
address y and put the result in address z. No, you must put the values in
registers first, then perform the subtraction, and then you'll get the
result in the same register. Examples follow in a few lines. I have to tell
you just one thing before that.

** 4

    1)  move.l  #$1000,d0
    2)  move.l  $1000,d0

Do you see the difference between these 2 lines? In line 1 there is a '#'
before the number.

LINE 1 MEANS: PUT THE VALUE '$1000' IN DATAREGISTER D0
LINE 2 MEANS: PUT THE CONTENTS OF ADDRESS $1000 IN DATAREGISTER D0.

This is of course a big difference. If address $1000 contained the value
#$129475, in the second case this value would be moved into D0.

VALUES ALWAYS HAVE A '#' PRECEDING. Addresses have nothing.

*******

Now take back the small program. See the 'MOVE.B #5,d0' ? You ought to
understand everything of it now:

The '.B' means that you're gonna work with only 8 bits at a time (a BYTE).
The # means that a VALUE is about to follow: the value is #5.
There's no $ or % before the number, which means that the number is decimal.
(note: decimal 5 and hexadecimal 5 are the same...)
D0 is the dataregister into which the number will be moved.

So: move.b #5,d0 means: move decimal value 5 into dataregister 0. The size
of the moved value is 8 bits. (note: one register contains 32 bits)

    MOVE.L #$1000,d1

this means: move the hexadecimal value 1000 into datareg 1. Here you
transfer a LONGWORD (32 bits).

    MOVE.W $2000,d1

now you move the WORD that is stored at address $2000 into datareg 1.
Let's say that the memory looks as follows:

    addr:   $1ffe  $1fff  $2000  $2001  $2002  $2003  $2004
    value:  #$a4   #$4d   #$00   #$48   #$29   #$00   #$35

The value moved to D1 will be in that case: #$0048
(1 word = 2 bytes, starting from $2000)
If we moved a longword (move.l $2000,d1) the value would be #$00482900

I think the other lines are not too difficult.
Now you know very much already. But there's much more to come...
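
As a quick recap of the four points above, here is a small sketch in the
same assembler syntax. The register contents in the comments assume the
memory layout from the table above (bytes #$00, #$48, #$29, #$00 starting at
$2000); that layout and the label 'recap' are assumptions made only for
illustration.

```asm
recap:  move.l  #0,d0           ; d0 = $00000000
        move.b  #%101,d0        ; binary VALUE 5       -> d0 = $00000005
        move.b  #$ff,d0         ; hexadecimal VALUE ff -> d0 = $000000ff
        move.w  #$2000,d0       ; the VALUE $2000      -> d0 = $00002000
        move.w  $2000,d0        ; CONTENTS of $2000    -> d0 = $00000048
        move.l  $2000,d0        ; 4 bytes from $2000   -> d0 = $00482900
        rts
```

Note how the only difference between the fourth and the fifth line is the
'#', and how the size (.b, .w or .l) decides how much of D0 is replaced.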
You see that we moved the value to a dataregister: D0. That's not necessary.
We could also move it to an addressreg, or to an address in memory. These
are various 'ADDRESSING METHODS', and if you use the addressing methods in a
clever way, your program can be much faster or better...

Which are these addressing methods ? Let's explain it with an example for
each one:

( '.x'  means it can be any size (byte, word or longword)
  #x    means a value like #400, #$ffa0 or #%10010101
  addr  means an address like $c0000
  Dx    means 'any dataregister' (d0 - d7)
  Ax    means 'any addressregister' (a0 - a7) )

NORMAL ADDRESSINGS (most commonly used; officially called 'immediate',
- - - - - - - - -   'absolute' and 'register direct' addressing)

    MOVE.x #x,addr    : move a value to an address
    MOVE.x addr,addr  : move the contents of an address to another address
    MOVE.x #x,Dx      : move a value x to a dataregister
    MOVE.x addr,Dx    : move the contents of an address to a dataregister
    MOVE.x Dx,addr    : move the contents of a datareg into an address
    MOVE.x #x,Ax      : move a value to an addressregister.
                        Since addresses in the AMIGA are 32 bits long, most
                        of these moves are LONG (MOVE.L)
    MOVE.x addr,Ax    : move the contents of an address to an addressreg,
                        also mostly a longword
    MOVE.x Ax,addr    : move the contents of an addressreg into an address
    MOVE.x Dx,Dy      : move from one datareg into another one
    MOVE.x Ax,Dy      : or from an addressreg into a datareg
    MOVE.x Dx,Ay      : or any other combination you can think of ...

INDIRECT ADDRESSING:
- - - - - - - - - -

    MOVE.x #x,(Ax)

THIS IS VERY INTERESTING !!! Now you move the value 'x' not into the
addressregister (like in 'MOVE.x #x,Ax') but into the address that is
stored in this addressregister.

For example, if A0 is currently filled with the VALUE #$fc0000,
MOVE.B #4,(A0) would cause the ADDRESS $fc0000 to be filled with the value
#4. (This would have the same effect as MOVE.B #4,$fc0000)

Now you see why A-registers are called ADDRESSregisters. The values that
are stored in an ADDRESSREGISTER represent ADDRESSES.
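
A minimal sketch of the difference between moving INTO an addressregister
and moving THROUGH it. The label 'buffer' and the DC.B/EVEN lines are
assumptions for illustration (labels and DC.B are explained further on in
this text); a label is used instead of a fixed address like $fc0000 so that
the example only writes to memory that belongs to the program itself.

```asm
start:  move.l  #buffer,a0      ; a0 = the ADDRESS of 'buffer'
        move.b  #4,(a0)         ; the byte AT that address becomes 4
        move.b  (a0),d0         ; read it back: low byte of d0 = 4
        move.l  a0,d1           ; d1 = the address itself, not the byte
        rts

buffer: dc.b    0               ; one byte of storage
        even
```

After this runs, A0 still holds the same address: plain (A0) never changes
the register, it only says where in memory the move takes place.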
Indirect addressing can only be done with addressregisters, so
MOVE.x #x,(Dx) isn't allowed. Values stored in a DATAREGISTER represent
VALUES only.

All combinations are allowed:

    move.x (Ax),Dy
    move.x (Ax),addr
    move.x (Ax),(Ay)
    move.x Dx,(Ay)
    ...

You can also put an OFFSET with the (Ax). If A0 contains $10000 and you
wanna put something in $10004 (which happens to be A0 + #$4) you just do:

    MOVE.x #x,4(A0)

 - watch the notation: in hexadecimal it is for example $a(A0).
   Don't put a '#' (wrong: #$a(A0) )
 - the offset is limited: it is a signed 16-bit number, so it must lie
   between -32768 and 32767.
   This would be too large:  $102447(A0)
   but this is still OK:     $200(A0)

INDIRECT ADDRESSING WITH POSTINCREMENT:
- - - - - - - - - - - - - - - - - - - -

example: MOVE.B #0,(A0)+

That's another VERY INTERESTING way of addressing. Let's say you wish to
fill a row of addresses starting from $10000 with value 0. Then you put
this address in an addressregister, and you use indirect addressing WITH
'POSTINCREMENT'. AFTER THE INSTRUCTION (in this case MOVE), the
addressregister will be increased automatically, so that it points to the
next byte, word or longword (depending on the size: move.b, .w or .l).

Let's say A0 contains #$10000 and we do a MOVE.L #$2da30,(A0)+
Memory will look like this:

    $10000  $10001  $10002  $10003
    #$00    #$02    #$da    #$30

and the value in A0 will be #$10004.

You can't combine this with an offset:

    WRONG: MOVE.x #x,4(A0)+

Almost the same thing is:

IND.ADDR. WITH PREDECREMENT
- - - - - - - - - - - - - -

example: MOVE.B #0,-(A0)

Here, the contents of A0 will first be decreased to the next lower byte (or
word or longword, depending on the size again), and THEN the move will be
done.
So, if A0 contains #$5c000:

    MOVE.W #$204a,-(A0)

will have the following effect:
A0 will be decreased by 2 (1 word = 2 bytes), making it #$5bffe,
and then #$204a will be put in locations $5bffe & $5bfff.

Don't mix up predecrement and negative offsets:

    MOVE.x #x,-(A0)     ; Ind.addressing with predecrement
    MOVE.x #x,-4(A0)    ; Ind.addressing with an offset '-4'

SPECIAL EXAMPLE OF POSTINC/PREDECR
- - - - - - - - - - - - - - - - -

    MOVEM.L D0-D7/A0-A6,-(A7)
    MOVEM.L (A7)+,D0-D7/A0-A6

These are 2 special instructions; as you see, they use predecrement and
postincrement addressing. MOVEM means 'move multiple'. The first one
decreases the value in A7 and stores a register at that address, decreases
A7 again and stores the next register, and so on until all the listed
registers are saved. (The processor works from A6 down to D0 here, so D0
ends up at the lowest address.) The second instruction gets all these
values back from (A7) and puts them in the registers again.

This way you are able to save the contents of all the registers, and get
them back after, for example, you performed a subroutine (where you changed
them).

A7 is what is called the STACKPOINTER. In it is the address of the 'STACK',
a place where important values are stored as 'FIRST IN LAST OUT'. If you
put D0 on it, then D1, then D2, the first value that you'll get back is D2,
then D1 and then D0.

This is also used for jumping to subroutines:

main:       bsr     routine
            ...

routine:    ....
            ...
            rts

The BSR will make the computer save the current address on the stack, and
when the routine is finished (RTS), this address will be taken back from
the stack, so the program can continue from where it left off.
(this is done automatically, don't worry)


ABOUT RELATIVE ADDRESSING
-------------------------

You surely know that the AMIGA is a multitasking computer. That means that
you can run more than one program at a time, and so have more than 1
program in memory at a time. If you load a program, you can never be sure
where in memory it will end up. If you just loaded another program, there
will be no room for the second one in that same place, so it all depends !!
A program that is loaded first tells the Amiga how much memory it will
need, the Amiga checks where it has some room, and it tells the program
where this room starts. A program can therefore never say, for example,
'jump to address $10000', because it doesn't know if it will be located
there. The program instead says: jump to (starting location + offset):

    $10000: starting location
            ...
    $12000: routine

It would be something like :  JMP (STARTING LOCATION + $2000)
BUT !!NOT!! :                 JMP $12000

because when the program gets loaded another time, the starting location
could be for example $14000, putting the routine at $16000. The line
'JMP starting location + $2000' however is still valid.

This has its consequences when using asmone, or any other language on the
AMIGA. You can, for example, not put a picture on address $70000 (although
many BAAAAAD coders do this) because you simply don't know if there's room
at that location. (Don't get it wrong: you CAN do it, but it's WRONG. The
computer could crash.)

LABELS!
- - - -

In asmone, addresses are given a name: a LABEL. This is very interesting.
You just give a routine a label, and if you wish to jump to it, you say
"JMP labelname" instead of "JMP address". (JMP to a fixed address would be
against the rules of relative addressing.) When you assemble your source,
asmone will look where there's room to put it, and change the labels into
relative addresses. In fact YOU don't have to worry anymore.

The program we saw earlier contained labels too: Top, Line1 and End are
labels. asmone will calculate the correct addresses for them when you
assemble it.

In fact, you should replace each word 'address' in this text with the word
'label'. For example you must write

    MOVE.x #x,label     instead of      MOVE.x #x,addr
    MOVE.x label,Dx        "    "       MOVE.x addr,Dx
    ....                                ....

When assembling, each command will be translated into a number which is the
'RAW' machinecode for this command.
(for example: jmp will get the number #$4ef9 )
This value is stored somewhere in memory, at a certain location 'starting
location + offset'. Values of successive commands will be put behind each
other in memory.

************

Now the time is right to draw your attention again to 'GENERAL THINGS #4'
(read this part again please). You should be able to tell me what the
difference is between

    MOVE.L LABEL,D0
    MOVE.L #LABEL,D0

But to be sure, I'll tell you: in the first case, the longword-value that
is in addresses LABEL, LABEL+1, LABEL+2 and LABEL+3 will be put in D0. In
the second case, the addressVALUE of LABEL will be put in D0. We don't know
this address until the program is assembled.

Some examples on this:

* program:  move.b  data,D0     ; the CONTENTS of 'data'
            rts

  data:     dc.b    10

  Now D0 will contain the byte stored at address 'data' (#10)

* program:  move.l  #data,A0    ; the addressVALUE of 'data'
            move.b  (A0)+,D0
            move.b  (A0),D1
            rts

  data:     dc.b    10,11

  First we moved the address 'data' to A0. Then we moved the byte stored at
  (A0) to D0, A0 was increased by one BYTE, and we moved the byte now at
  (A0) to D1. D0 will contain #10, and D1 will contain #11.

Please note: if you put an address into an addressregister, like in the
last example, you must use a LONGWORD move (MOVE.L) because each address is
32 bits long. If you did 'MOVE.W' you would only have moved a part of the
address (MOVE.B into an addressregister isn't even allowed). This is not
forbidden, but it was not your intention: if the address of 'data' was
$00073a00, a MOVE.W #data,A0 would have caused A0 to be filled with
$00003a00, which is also an address, but not the address you wanted !!!


REMARK
------

Dataregs like D0, and addressregs like A0, have a length of 32 bits, in
other words: LONGWORDS. If you move a BYTE or a WORD into a dataregister,
it won't be filled up completely. (Addressregisters are different: a WORD
moved into one is sign-extended to fill all 32 bits, and BYTE moves into
them are not allowed.) The not filled part of the dataregister isn't
changed...
example:    D0 contained:       #$12345678  (longword)

            MOVE.B #$00,D0
            now D0 contains:    #$12345600

            MOVE.W #$3333,D0
            now D0 contains:    #$12343333

            MOVE.L #$3333,D0
            now D0 contains:    #$00003333


REMARK2
-------

Please refer back to GENERAL THINGS #2. You'll see that a word or a
longword can only start at EVEN addresses. Now look at this program:

start:      move.w  ....
            .....               ; other commands
            .....

data:       dc.b    0           ; 1 byte of data

routine:    move.....           ; again commands....

asmone takes care that the start of this program is at an even address, and
because all commands are an even number of bytes long, all other commands
will be put on even addresses too. BUT, now we put ONE byte at a certain
location. The next commands will start on an ODD address, which is
forbidden. One simple mistake like this could have caused a 'system crash'
if you didn't have asmone. asmone of course notices this mistake and says:
WORD AT ODD ADDRESS. All you have to do is put the command 'EVEN' behind
the data that caused the mistake:

data:       dc.b    0
            even

routine:    ....

It's best to put 'EVEN' behind each row of DC.B's.

This is a similar mistake: (often made, often hard to find)

            move.l  #data,A0
            move.l  #otherdata,A1
            move.b  (A0)+,(A1)+
            move.w  (A0)+,(A1)+     ; **** wrong !!
            ...

data:       dc.b    0,20,12,23,....
            even

otherdata:  dc.b    0,0,0,0,....
            even

This program first puts the addresses of 'data' and 'otherdata' in 2
addressregs. DATA is at an even address, correct. Then it moves a BYTE from
the data-row into the 'otherdata' row. The increment from
MOVE.B (A0)+,(A1)+ will add 1 to the even values in A0 and A1. Next we want
to move a WORD at (A0) to a WORD at (A1), and you guessed right: it won't
work... because both addresses are now ODD, and a word can only be at an
EVEN address. Another MOVE.B (A0)+,(A1)+ would cause no problem.


SOME MORE COMMANDS
------------------

Till now, we've only seen the instruction MOVE. Here are some other ones.
You can use most of the addressing methods on these commands.
I don't know exactly which are allowed and which not, but you won't use
most of them after all, and if you accidentally use one that isn't allowed,
asmone will tell you, and it's corrected just as soon. So here we go:

    ADD.x a,b   add a to b, result comes in b.
                (a and b can be #x, label, Ax, Dx, x(Ax), (Ax)+,... but
                unless a is a value #x, one of the two must be a register)
    SUB.x a,b   subtract a from b, result in b. (idem)
    CMP.x a,b   compare b with a (b - a is calculated, but b isn't
                changed). b is normally a register and a can be almost
                anything; to compare with something in memory, a must be
                a value #x.
    TST.x a     compare a with zero. a can't be an addressregister

    BSR label : branch to a subroutine. You'll get back with 'RTS'
    BRA label : perform an unconditional branch to another location in
                the program. You cannot return with RTS
    BNE label : branch if not equal (after a CMP or TST)
    BEQ label : branch if equal
    BLT, BLE, BGT, BGE: branch if less than, less or equal, greater than,
                greater or equal (after CMP a,b: if b is less than a, and
                so on)

    SWAP Dx   : swap the contents of the lower word and the higher word
                of a dataregister:
                        D0: $xxxx yyyy
                        swap d0
                        D0: $yyyy xxxx

Moving addresses to an addressregister can be done with

    MOVE.L #label,Ax

but there's a special command made for it:

    LEA.L label,Ax

Notice that there's no # anymore. This works only for addr.regs !!

NOT SO OFTEN USED COMMANDS
- - - - - - - - - - - - -

    MULU a,b    multiply a with b (this is a very slow command!)
    DIVU a,b    divide b by a (idem, avoid using them!)
    OR.x a,b    perform a logical OR  with a on b
    AND.x a,b      "    "    "    AND with a on b
    EOR.x a,b      "    "    "    EOR with a on b
    NOT.x a     perform NOT on a

        a        100101011
        b        001100110
                 ---------
        a and b: 000100010

the result bit will only be set if both the bit in a AND the bit in b were
set.

or is the same, but the result bit will be set if the bit in a OR the bit
in b was set (or both):

        a or b:  101101111

eor is EXCLUSIVE OR:
only if a bit is set in a AND NOT in b, or set in b and not in a, the
result will be 1

        a eor b: 101001101

not changes 1 to 0 and 0 to 1:

        not (001010) = 110101

    BTST #x,Dx  : check if a bit is equal to zero (in a dataregister)

    D0: 10110010 01010111 10010100 00011011
        ^                 ^               ^
    bit 31                15              0

                  BTST #15,D0 -> not equal !! bit 15 is set !!

    BCLR #x,Dx  : clear a bit in a datareg
    BSET #x,Dx  : set a bit in a datareg

    ASL.x #x,Dx : shift the bits in Dx x times to the left. If the last
                  bit shifted out on the left was 1, carry will be set,
                  else carry will be cleared
    ASR.x #x,Dx : shift to the right. If the last bit shifted out on the
                  right was 1, carry will be set, else carry will be
                  cleared
                  (these shifts work on dataregisters, not on
                  addressregisters)
    BCS, BCC    : branch if carry set/clear

I think these are all the commands you'll use. A complete list with full
details will follow soon, but you don't need it yet. It's full of numbers
and symbols, and it would only make this text harder to follow. Please
refer to the sources to see some real examples & experiments...


asmone COMMANDS
---------------

You already knew it, but I say it anyway: asmone has 2 states: EDITOR STATE
and ONLINE STATE... If you start asmone, you get a 'PROMPT'. It looks
like: 'ASM1>'
You can switch to the editor and back by pressing <ESC>

    [..]    means that this part is optional
    <label> means you mustn't type 'label' but just a labelname

Here's a list of online-commands:

    a           - assemble the source
    j[<label>]  - jump (if you give a labelname, the program will jump to
                  the address which corresponds with the label)
                  (you could also enter an address like $30000 instead of
                  a label)
    l<string>   - look for a string in the source
    @d<label>   - disassemble a part of your assembled source, starting at
                  <label> (press return to continue)
    @h<label>   - show memory in ascii & hex starting from a label
    @m<label>   - modify memory... (not sure)
    e           - load >extern files... see sources for examples
    t           - go to the top of the file
    b           - bottom
    v[<dir>]    - show the contents of the current directory.
                  If you give a dir-name, the current directory will
                  change to that directory & display the contents.
    v <dir>     - v+space: change the current directory but don't show
                  the contents.
    r           - read a source (or show a requester to select one)
    w           - write source
    wo          - write the OBJECT (the assembled version, which is
                  executable) to disk

IN EDITOR MODE:

    SHIFT UP/DWN - fast moving in source
    AMI B        - start a block (to cut/copy)
    AMI C        - copy block
    AMI X        - cut block
    AMI I        - insert block
    SHIFT Fx     - MARK A POSITION (1..3)
    Fx           - JUMP TO MARKED POSITION

For more detailed info, see the asmone DOCUMENTATION...

********

I think this is enough for this time... If you've read all this, you've
seen nearly every aspect of assembler. I realise this is a whole bunch of
information, so take your time to understand it. It's not easy, though it
seems 'NORMAL' for someone who knows it. If you've read all this, it's time
you took a look at the sources, where you can experiment and have a look at
some results. The copies will follow soon, but you ABSOLUTELY don't need
them yet. First get used to the language and its characteristics.

I hope you could understand most of this bullshit. If you don't get
something, or you have questions, just ask me, and I'll try to answer...

NOW HAVE A LOOK AT THE SOURCES AND MAKE SOME YOURSELF !!!
TRYING IS THE BEST WAY TO LEARN !!!

SEE YOU SOON !

GREETINGS from Geert / EIKENLAAN 21 / 3740 BILZEN / BELGIUM